home *** CD-ROM | disk | FTP | other *** search
- <?xml version="1.0" encoding="UTF-8" ?>
- <!DOCTYPE manualpage SYSTEM "./style/manualpage.dtd">
- <?xml-stylesheet type="text/xsl" href="./style/manual.en.xsl"?>
- <!-- $Revision: 1.6.2.8 $ -->
-
- <!--
- Copyright 2002-2004 The Apache Software Foundation
-
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
- -->
-
- <manualpage metafile="logs.xml.meta">
-
- <title>Log Files</title>
-
- <summary>
- <p>In order to effectively manage a web server, it is necessary
- to get feedback about the activity and performance of the
- server as well as any problems that may be occurring. The Apache
- HTTP Server provides very comprehensive and flexible logging
- capabilities. This document describes how to configure its
- logging capabilities, and how to understand what the logs
- contain.</p>
- </summary>
-
- <section id="security">
- <title>Security Warning</title>
-
- <p>Anyone who can write to the directory where Apache is
- writing a log file can almost certainly gain access to the uid
- that the server is started as, which is normally root. Do
- <em>NOT</em> give people write access to the directory the logs
- are stored in without being aware of the consequences; see the
- <a href="misc/security_tips.html">security tips</a> document
- for details.</p>
-
- <p>In addition, log files may contain information supplied
- directly by the client, without escaping. Therefore, it is
- possible for malicious clients to insert control-characters in
- the log files, so care must be taken in dealing with raw
- logs.</p>
- </section>
-
- <section id="errorlog">
- <title>Error Log</title>
-
- <related>
- <directivelist>
- <directive module="core">ErrorLog</directive>
- <directive module="core">LogLevel</directive>
- </directivelist>
- </related>
-
- <p>The server error log, whose name and location is set by the
- <directive module="core">ErrorLog</directive> directive, is the
- most important log file. This is the place where Apache httpd
- will send diagnostic information and record any errors that it
- encounters in processing requests. It is the first place to
- look when a problem occurs with starting the server or with the
- operation of the server, since it will often contain details of
- what went wrong and how to fix it.</p>
-
- <p>The error log is usually written to a file (typically
- <code>error_log</code> on unix systems and
- <code>error.log</code> on Windows and OS/2). On unix systems it
- is also possible to have the server send errors to
- <code>syslog</code> or <a href="#piped">pipe them to a
- program</a>.</p>
-
- <p>The format of the error log is relatively free-form and
- descriptive. But there is certain information that is contained
- in most error log entries. For example, here is a typical
- message.</p>
-
- <example>
- [Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1]
- client denied by server configuration:
- /export/home/live/ap/htdocs/test
- </example>
-
- <p>The first item in the log entry is the date and time of the
- message. The second entry lists the severity of the error being
- reported. The <directive module="core">LogLevel</directive>
- directive is used to control the types of errors that are sent
- to the error log by restricting the severity level. The third
- entry gives the IP address of the client that generated the
- error. Beyond that is the message itself, which in this case
- indicates that the server has been configured to deny the
- client access. The server reports the file-system path (as
- opposed to the web path) of the requested document.</p>
-
- <p>A very wide variety of different messages can appear in the
- error log. Most look similar to the example above. The error
- log will also contain debugging output from CGI scripts. Any
- information written to <code>stderr</code> by a CGI script will
- be copied directly to the error log.</p>
-
- <p>It is not possible to customize the error log by adding or
- removing information. However, error log entries dealing with
- particular requests have corresponding entries in the <a
- href="#accesslog">access log</a>. For example, the above example
- entry corresponds to an access log entry with status code 403.
- Since it is possible to customize the access log, you can
- obtain more information about error conditions using that log
- file.</p>
-
- <p>During testing, it is often useful to continuously monitor
- the error log for any problems. On unix systems, you can
- accomplish this using:</p>
-
- <example>
- tail -f error_log
- </example>
- </section>
-
- <section id="accesslog">
- <title>Access Log</title>
-
- <related>
- <modulelist>
- <module>mod_log_config</module>
- <module>mod_setenvif</module>
- </modulelist>
- <directivelist>
- <directive module="mod_log_config">CustomLog</directive>
- <directive module="mod_log_config">LogFormat</directive>
- <directive module="mod_setenvif">SetEnvIf</directive>
- </directivelist>
- </related>
-
- <p>The server access log records all requests processed by the
- server. The location and content of the access log are
- controlled by the <directive module="mod_log_config">CustomLog</directive>
- directive. The <directive module="mod_log_config">LogFormat</directive>
- directive can be used to simplify the selection of
- the contents of the logs. This section describes how to configure the server
- to record information in the access log.</p>
-
- <p>Of course, storing the information in the access log is only
- the start of log management. The next step is to analyze this
- information to produce useful statistics. Log analysis in
- general is beyond the scope of this document, and not really
- part of the job of the web server itself. For more information
- about this topic, and for applications which perform log
- analysis, check the <a
- href="http://dmoz.org/Computers/Software/Internet/Site_Management/Log_analysis/">
- Open Directory</a> or <a
- href="http://dir.yahoo.com/Computers_and_Internet/Software/Internet/World_Wide_Web/Servers/Log_Analysis_Tools/">
- Yahoo</a>.</p>
-
- <p>Various versions of Apache httpd have used other modules and
- directives to control access logging, including
- mod_log_referer, mod_log_agent, and the
- <code>TransferLog</code> directive. The <directive
- module="mod_log_config">CustomLog</directive> directive now subsumes
- the functionality of all the older directives.</p>
-
- <p>The format of the access log is highly configurable. The format
- is specified using a format string that looks much like a C-style
- printf(1) format string. Some examples are presented in the next
- sections. For a complete list of the possible contents of the
- format string, see the <module>mod_log_config</module> <a
- href="mod/mod_log_config.html#formats">format strings</a>.</p>
-
- <section id="common">
- <title>Common Log Format</title>
-
- <p>A typical configuration for the access log might look as
- follows.</p>
-
- <example>
- LogFormat "%h %l %u %t \"%r\" %>s %b" common<br />
- CustomLog logs/access_log common
- </example>
-
- <p>This defines the <em>nickname</em> <code>common</code> and
- associates it with a particular log format string. The format
- string consists of percent directives, each of which tell the
- server to log a particular piece of information. Literal
- characters may also be placed in the format string and will be
- copied directly into the log output. The quote character
- (<code>"</code>) must be escaped by placing a back-slash before
- it to prevent it from being interpreted as the end of the
- format string. The format string may also contain the special
- control characters "<code>\n</code>" for new-line and
- "<code>\t</code>" for tab.</p>
-
- <p>The <directive module="mod_log_config">CustomLog</directive>
- directive sets up a new log file using the defined
- <em>nickname</em>. The filename for the access log is relative to
- the <directive module="core">ServerRoot</directive> unless it
- begins with a slash.</p>
-
- <p>The above configuration will write log entries in a format
- known as the Common Log Format (CLF). This standard format can
- be produced by many different web servers and read by many log
- analysis programs. The log file entries produced in CLF will
- look something like this:</p>
-
- <example>
- 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET
- /apache_pb.gif HTTP/1.0" 200 2326
- </example>
-
- <p>Each part of this log entry is described below.</p>
-
- <dl>
- <dt><code>127.0.0.1</code> (<code>%h</code>)</dt>
-
- <dd>This is the IP address of the client (remote host) which
- made the request to the server. If <directive
- module="core">HostnameLookups</directive> is
- set to <code>On</code>, then the server will try to determine
- the hostname and log it in place of the IP address. However,
- this configuration is not recommended since it can
- significantly slow the server. Instead, it is best to use a
- log post-processor such as <a
- href="programs/logresolve.html">logresolve</a> to determine
- the hostnames. The IP address reported here is not
- necessarily the address of the machine at which the user is
- sitting. If a proxy server exists between the user and the
- server, this address will be the address of the proxy, rather
- than the originating machine.</dd>
-
- <dt><code>-</code> (<code>%l</code>)</dt>
-
- <dd>The "hyphen" in the output indicates that the requested
- piece of information is not available. In this case, the
- information that is not available is the RFC 1413 identity of
- the client determined by <code>identd</code> on the clients
- machine. This information is highly unreliable and should
- almost never be used except on tightly controlled internal
- networks. Apache httpd will not even attempt to determine
- this information unless <directive
- module="core">IdentityCheck</directive> is set
- to <code>On</code>.</dd>
-
- <dt><code>frank</code> (<code>%u</code>)</dt>
-
- <dd>This is the userid of the person requesting the document
- as determined by HTTP authentication. The same value is
- typically provided to CGI scripts in the
- <code>REMOTE_USER</code> environment variable. If the status
- code for the request (see below) is 401, then this value
- should not be trusted because the user is not yet
- authenticated. If the document is not password protected,
- this entry will be "<code>-</code>" just like the previous
- one.</dd>
-
- <dt><code>[10/Oct/2000:13:55:36 -0700]</code>
- (<code>%t</code>)</dt>
-
- <dd>
- The time that the server finished processing the request.
- The format is:
-
- <p class="indent">
- <code>[day/month/year:hour:minute:second zone]<br />
- day = 2*digit<br />
- month = 3*letter<br />
- year = 4*digit<br />
- hour = 2*digit<br />
- minute = 2*digit<br />
- second = 2*digit<br />
- zone = (`+' | `-') 4*digit</code>
- </p>
- It is possible to have the time displayed in another format
- by specifying <code>%{format}t</code> in the log format
- string, where <code>format</code> is as in
- <code>strftime(3)</code> from the C standard library.
- </dd>
-
- <dt><code>"GET /apache_pb.gif HTTP/1.0"</code>
- (<code>\"%r\"</code>)</dt>
-
- <dd>The request line from the client is given in double
- quotes. The request line contains a great deal of useful
- information. First, the method used by the client is
- <code>GET</code>. Second, the client requested the resource
- <code>/apache_pb.gif</code>, and third, the client used the
- protocol <code>HTTP/1.0</code>. It is also possible to log
- one or more parts of the request line independently. For
- example, the format string "<code>%m %U%q %H</code>" will log
- the method, path, query-string, and protocol, resulting in
- exactly the same output as "<code>%r</code>".</dd>
-
- <dt><code>200</code> (<code>%>s</code>)</dt>
-
- <dd>This is the status code that the server sends back to the
- client. This information is very valuable, because it reveals
- whether the request resulted in a successful response (codes
- beginning in 2), a redirection (codes beginning in 3), an
- error caused by the client (codes beginning in 4), or an
- error in the server (codes beginning in 5). The full list of
- possible status codes can be found in the <a
- href="http://www.w3.org/Protocols/rfc2616/rfc2616.txt">HTTP
- specification</a> (RFC2616 section 10).</dd>
-
- <dt><code>2326</code> (<code>%b</code>)</dt>
-
- <dd>The last entry indicates the size of the object returned
- to the client, not including the response headers. If no
- content was returned to the client, this value will be
- "<code>-</code>". To log "<code>0</code>" for no content, use
- <code>%B</code> instead.</dd>
- </dl>
- </section>
-
- <section id="combined">
- <title>Combined Log Format</title>
-
- <p>Another commonly used format string is called the Combined
- Log Format. It can be used as follows.</p>
-
- <example>
- LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\"
- \"%{User-agent}i\"" combined<br />
- CustomLog log/access_log combined
- </example>
-
- <p>This format is exactly the same as the Common Log Format,
- with the addition of two more fields. Each of the additional
- fields uses the percent-directive
- <code>%{<em>header</em>}i</code>, where <em>header</em> can be
- any HTTP request header. The access log under this format will
- look like:</p>
-
- <example>
- 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET
- /apache_pb.gif HTTP/1.0" 200 2326
- "http://www.example.com/start.html" "Mozilla/4.08 [en]
- (Win98; I ;Nav)"
- </example>
-
- <p>The additional fields are:</p>
-
- <dl>
- <dt><code>"http://www.example.com/start.html"</code>
- (<code>\"%{Referer}i\"</code>)</dt>
-
- <dd>The "Referer" (sic) HTTP request header. This gives the
- site that the client reports having been referred from. (This
- should be the page that links to or includes
- <code>/apache_pb.gif</code>).</dd>
-
- <dt><code>"Mozilla/4.08 [en] (Win98; I ;Nav)"</code>
- (<code>\"%{User-agent}i\"</code>)</dt>
-
- <dd>The User-Agent HTTP request header. This is the
- identifying information that the client browser reports about
- itself.</dd>
- </dl>
- </section>
-
- <section id="multiple">
- <title>Multiple Access Logs</title>
-
- <p>Multiple access logs can be created simply by specifying
- multiple <directive module="mod_log_config">CustomLog</directive>
- directives in the configuration
- file. For example, the following directives will create three
- access logs. The first contains the basic CLF information,
- while the second and third contain referer and browser
- information. The last two <directive
- module="mod_log_config">CustomLog</directive> lines show how
- to mimic the effects of the <code>ReferLog</code> and <code
- >AgentLog</code> directives.</p>
-
- <example>
- LogFormat "%h %l %u %t \"%r\" %>s %b" common<br />
- CustomLog logs/access_log common<br />
- CustomLog logs/referer_log "%{Referer}i -> %U"<br />
- CustomLog logs/agent_log "%{User-agent}i"
- </example>
-
- <p>This example also shows that it is not necessary to define a
- nickname with the <directive
- module="mod_log_config">LogFormat</directive> directive. Instead,
- the log format can be specified directly in the <directive
- module="mod_log_config">CustomLog</directive> directive.</p>
- </section>
-
- <section id="conditional">
- <title>Conditional Logs</title>
-
- <p>There are times when it is convenient to exclude certain
- entries from the access logs based on characteristics of the
- client request. This is easily accomplished with the help of <a
- href="env.html">environment variables</a>. First, an
- environment variable must be set to indicate that the request
- meets certain conditions. This is usually accomplished with
- <directive module="mod_setenvif">SetEnvIf</directive>. Then the
- <code>env=</code> clause of the <directive
- module="mod_log_config">CustomLog</directive> directive is used to
- include or exclude requests where the environment variable is
- set. Some examples:</p>
-
- <example>
- # Mark requests from the loop-back interface<br />
- SetEnvIf Remote_Addr "127\.0\.0\.1" dontlog<br />
- # Mark requests for the robots.txt file<br />
- SetEnvIf Request_URI "^/robots\.txt$" dontlog<br />
- # Log what remains<br />
- CustomLog logs/access_log common env=!dontlog
- </example>
-
- <p>As another example, consider logging requests from
- english-speakers to one log file, and non-english speakers to a
- different log file.</p>
-
- <example>
- SetEnvIf Accept-Language "en" english<br />
- CustomLog logs/english_log common env=english<br />
- CustomLog logs/non_english_log common env=!english
- </example>
-
- <p>Although we have just shown that conditional logging is very
- powerful and flexibly, it is not the only way to control the
- contents of the logs. Log files are more useful when they
- contain a complete record of server activity. It is often
- easier to simply post-process the log files to remove requests
- that you do not want to consider.</p>
- </section>
- </section>
-
- <section id="rotation">
- <title>Log Rotation</title>
-
- <p>On even a moderately busy server, the quantity of
- information stored in the log files is very large. The access
- log file typically grows 1 MB or more per 10,000 requests. It
- will consequently be necessary to periodically rotate the log
- files by moving or deleting the existing logs. This cannot be
- done while the server is running, because Apache will continue
- writing to the old log file as long as it holds the file open.
- Instead, the server must be <a
- href="stopping.html">restarted</a> after the log files are
- moved or deleted so that it will open new log files.</p>
-
- <p>By using a <em>graceful</em> restart, the server can be
- instructed to open new log files without losing any existing or
- pending connections from clients. However, in order to
- accomplish this, the server must continue to write to the old
- log files while it finishes serving old requests. It is
- therefore necessary to wait for some time after the restart
- before doing any processing on the log files. A typical
- scenario that simply rotates the logs and compresses the old
- logs to save space is:</p>
-
- <example>
- mv access_log access_log.old<br />
- mv error_log error_log.old<br />
- apachectl graceful<br />
- sleep 600<br />
- gzip access_log.old error_log.old
- </example>
-
- <p>Another way to perform log rotation is using <a
- href="#piped">piped logs</a> as discussed in the next
- section.</p>
- </section>
-
- <section id="piped">
- <title>Piped Logs</title>
-
- <p>Apache httpd is capable of writing error and access log
- files through a pipe to another process, rather than directly
- to a file. This capability dramatically increases the
- flexibility of logging, without adding code to the main server.
- In order to write logs to a pipe, simply replace the filename
- with the pipe character "<code>|</code>", followed by the name
- of the executable which should accept log entries on its
- standard input. Apache will start the piped-log process when
- the server starts, and will restart it if it crashes while the
- server is running. (This last feature is why we can refer to
- this technique as "reliable piped logging".)</p>
-
- <p>Piped log processes are spawned by the parent Apache httpd
- process, and inherit the userid of that process. This means
- that piped log programs usually run as root. It is therefore
- very important to keep the programs simple and secure.</p>
-
- <p>One important use of piped logs is to allow log rotation
- without having to restart the server. The Apache HTTP Server
- includes a simple program called <a
- href="programs/rotatelogs.html">rotatelogs</a> for this
- purpose. For example, to rotate the logs every 24 hours, you
- can use:</p>
-
- <example>
- CustomLog "|/usr/local/apache/bin/rotatelogs
- /var/log/access_log 86400" common
- </example>
-
- <p>Notice that quotes are used to enclose the entire command
- that will be called for the pipe. Although these examples are
- for the access log, the same technique can be used for the
- error log.</p>
-
- <p>A similar but much more flexible log rotation program
- called <a href="http://www.cronolog.org/">cronolog</a>
- is available at an external site.</p>
-
- <p>As with conditional logging, piped logs are a very powerful
- tool, but they should not be used where a simpler solution like
- off-line post-processing is available.</p>
- </section>
-
- <section id="virtualhost">
- <title>Virtual Hosts</title>
-
- <p>When running a server with many <a href="vhosts/">virtual
- hosts</a>, there are several options for dealing with log
- files. First, it is possible to use logs exactly as in a
- single-host server. Simply by placing the logging directives
- outside the <directive module="core"
- type="section">VirtualHost</directive> sections in the
- main server context, it is possible to log all requests in the
- same access log and error log. This technique does not allow
- for easy collection of statistics on individual virtual
- hosts.</p>
-
- <p>If <directive module="mod_log_config">CustomLog</directive>
- or <directive module="core">ErrorLog</directive>
- directives are placed inside a
- <directive module="core" type="section">VirtualHost</directive>
- section, all requests or errors for that virtual host will be
- logged only to the specified file. Any virtual host which does
- not have logging directives will still have its requests sent
- to the main server logs. This technique is very useful for a
- small number of virtual hosts, but if the number of hosts is
- very large, it can be complicated to manage. In addition, it
- can often create problems with <a
- href="vhosts/fd-limits.html">insufficient file
- descriptors</a>.</p>
-
- <p>For the access log, there is a very good compromise. By
- adding information on the virtual host to the log format
- string, it is possible to log all hosts to the same log, and
- later split the log into individual files. For example,
- consider the following directives.</p>
-
- <example>
- LogFormat "%v %l %u %t \"%r\" %>s %b"
- comonvhost<br />
- CustomLog logs/access_log comonvhost
- </example>
-
- <p>The <code>%v</code> is used to log the name of the virtual
- host that is serving the request. Then a program like <a
- href="programs/other.html">split-logfile</a> can be used to
- post-process the access log in order to split it into one file
- per virtual host.</p>
- </section>
-
- <section id="other">
- <title>Other Log Files</title>
-
- <related>
- <modulelist>
- <module>mod_cgi</module>
- <module>mod_rewrite</module>
- </modulelist>
- <directivelist>
- <directive module="mpm_common">PidFile</directive>
- <directive module="mod_rewrite">RewriteLog</directive>
- <directive module="mod_rewrite">RewriteLogLevel</directive>
- <directive module="mod_cgi">ScriptLog</directive>
- <directive module="mod_cgi">ScriptLogBuffer</directive>
- <directive module="mod_cgi">ScriptLogLength</directive>
- </directivelist>
- </related>
-
- <section id="pidfile">
- <title>PID File</title>
-
- <p>On startup, Apache httpd saves the process id of the parent
- httpd process to the file <code>logs/httpd.pid</code>. This
- filename can be changed with the <directive
- module="mpm_common">PidFile</directive> directive. The
- process-id is for use by the administrator in restarting and
- terminating the daemon by sending signals to the parent
- process; on Windows, use the -k command line option instead.
- For more information see the <a href="stopping.html">Stopping
- and Restarting</a> page.</p>
- </section>
-
- <section id="scriptlog">
- <title>Script Log</title>
-
- <p>In order to aid in debugging, the
- <directive module="mod_cgi">ScriptLog</directive> directive
- allows you to record the input to and output from CGI scripts.
- This should only be used in testing - not for live servers.
- More information is available in the <a
- href="mod/mod_cgi.html">mod_cgi</a> documentation.</p>
- </section>
-
- <section id="rewritelog">
- <title>Rewrite Log</title>
-
- <p>When using the powerful and complex features of <a
- href="mod/mod_rewrite.html">mod_rewrite</a>, it is almost
- always necessary to use the <directive
- module="mod_rewrite">RewriteLog</directive> to help
- in debugging. This log file produces a detailed analysis of how
- the rewriting engine transforms requests. The level of detail
- is controlled by the <directive
- module="mod_rewrite">RewriteLogLevel</directive> directive.</p>
- </section>
- </section>
- </manualpage>
-
-
-
-
-